Collocation Extraction for Machine Translation
نویسندگان
چکیده
This paper reports on the development of a collocation extraction system that is designed within a commercial machine translation system in order to take advantage of the robust syntactic analysis that the system offers and to use this analysis to refine collocation extraction. Embedding the extraction system also addresses the need to provide information about the source language collocations in a system-specific form to support automatic generation of a collocation rulebase for analysis and translation.
منابع مشابه
Using collocation segmentation to extract translation units in a phrase-based statistical machine translation system
This report evaluates the impact of using a novel collocation segmentation method for phrase extraction in the standard phrase-based statistical machine translation approach. The collocation segmentation technique is implemented simultaneously in the source and target side. The resulting collocation segmentation is used to extract translation units. Experiments are reported in the Spanish-toEng...
متن کاملUsing collocation segmentation to extract translation units in a phrase-based statistical machine translation system Implementación de una segmentación estad́ıstica complementaria para extraer unidades de traducción en un sistema de traducción estad́ıstico basado en frases
This report evaluates the impact of using a novel collocation segmentation method for phrase extraction in the standard phrase-based statistical machine translation approach. The collocation segmentation technique is implemented simultaneously in the source and target side. The resulting collocation segmentation is used to extract translation units. Experiments are reported in the Spanish-toEng...
متن کاملCollocation Translation Acquisition Using Monolingual Corpora
Collocation translation is important for machine translation and many other NLP tasks. Unlike previous methods using bilingual parallel corpora, this paper presents a new method for acquiring collocation translations by making use of monolingual corpora and linguistic knowledge. First, dependency triples are extracted from Chinese and English corpora with dependency parsers. Then, a dependency ...
متن کاملArabic Collocation Extraction Based on Hybrid Methods
Collocation Extraction plays an important role in machine translation, information retrieval, secondary language learning, etc., and has obtained significant achievements in other languages, e.g. English and Chinese. There are some studies for Arabic collocation extraction using POS annotation to extract Arabic collocation. We used a hybrid method that included POS patterns and syntactic depend...
متن کاملInferring Semantics from Collocation Clusters to Represent Verbs and Nouns
Current lexical semantic theories provide representations at a coarse grained level. In this paper, I will provide motivations for a fine grained representation for verbs and. nouns. An initial case study is done to serve as evidence that a more detailed representation is needed for tasks that require high accuracy rates, such as machine translation. An automatic approach to gather fine grained...
متن کامل